Upload gyp packages to S3 after building #17
Conversation
Force-pushed from 49840da to b934c45
JackStrohm-asana left a comment:
This approach breaks our rollback story. Currently the gyp binaries live in the codez repo directly, so rolling back a deploy automatically rolls back to the correct binary version. Fibers and the node tarballs preserve this property via unique S3 keys — when codez is rolled back, the URL in that revision still points to the exact binary that was tested with it.
With a fixed S3 key (packages_amd64_node22.tar.gz), every new build overwrites the previous one. If we roll back codez to a revision that references an older sha256, Bazel will fetch the current (newer) binary from that URL, get a sha256 mismatch, and fail. We'd have a broken build exactly when we need rollback to work.
The fix is the same pattern used for the other binaries: incorporate a short hash of the tarball content into the S3 key. That way each build produces a distinct, permanent S3 object, and the URL+sha256 pair in codez always refers to a specific binary that won't disappear or change.
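A minimal shell sketch of that pattern (the file name, bucket, and key prefix are illustrative placeholders, not the actual workflow code):

```shell
# Derive an immutable, content-addressed S3 key for a built tarball.
# TARBALL and the bucket path are illustrative; sha256sum ships with coreutils.
TARBALL=packages_amd64_node22.tar.gz
printf 'stand-in for real build output' > "$TARBALL"
SHORT_SHA=$(sha256sum "$TARBALL" | cut -c1-8)    # first 8 hex chars of the digest
S3_KEY="node-gyp/v1/${TARBALL%.tar.gz}-${SHORT_SHA}.tar.gz"
echo "$S3_KEY"
# The upload would then be:
#   aws s3 cp "$TARBALL" "s3://asana-oss-cache/$S3_KEY"
# Every build writes a new object, so the URL+sha256 pair pinned in an old
# codez revision keeps pointing at the exact bytes it was tested with.
```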
After building native module packages (bcrypt, cld, unix-dgram, @datadog/pprof), upload them to s3://asana-oss-cache/node-gyp/v1/ in addition to the GitHub Release. This enables codez to fetch these packages via Bazel http_file instead of committing ~112 MB of tarballs to git, saving ~305 MB total per checkout (node18/node20 tarballs are dead code and will be deleted).

Changes:
- build-node-packages.yml: Add AWS OIDC auth + S3 upload step after release upload
- stage_for_s3.bash: Separate packages_*.tar.gz before the fibers loop to prevent them from being incorrectly mixed into the fibers archive

Requires IAM role `push_node_gyp_packages` to be provisioned first (Asana/codez PR #388637).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Force-pushed from b934c45 to 2b08492
harshita-gupta left a comment:
Good catch — updated. Each tarball now gets a content-hashed S3 key using the first 8 chars of its sha256 (e.g., packages_amd64_node22-bb5ac136.tar.gz). This matches the pattern used for Node binaries and fibers: each build produces an immutable artifact, and the URL+sha256 pair in codez always refers to that specific binary.
Co-authored with Claude
JackStrohm-asana left a comment:
Everything else looks good. At some point, the other files the script produces could be uploaded to S3 automatically as well.
workflows: swap softprops for gh CLI, add CloudFront reachability check, remove --acl

Three follow-up corrections to PR #17:

1. Remove `--acl public-read` from `aws s3 cp`. The bucket has `disable_confusing_acls = true` (BucketOwnerEnforced), which disables ACLs entirely; `BlockPublicAcls` + `IgnorePublicAcls` provide additional coverage, so the ACL flag is silently ignored. The IAM role (`S3_ACCESS_MODE.PUT`) also doesn't grant `PutObjectAcl`. Reads go via CloudFront OAC, not public S3.
2. Replace `softprops/action-gh-release` with GitHub's first-party `gh` CLI. `gh release upload` is pre-installed on GitHub-hosted runners, removes a third-party (single-maintainer) supply-chain dependency, and behaves equivalently with `--clobber`.
3. Add a post-upload CloudFront reachability check (`curl -fI`). If the CloudFront path_patterns allowlist doesn't include the key's prefix, Mac Bazel builds will silently 403. Failing the workflow here surfaces the issue before consumers hit it.

S3 path stays `node-gyp/*`; this PR no longer changes it (see codez PR #390222, which adds `node-gyp/*` to CloudFront's path_patterns in system_packages.tf).

Action pinning: tag-pinned per codez convention (100% of codez workflows use tags, not SHAs).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
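A hedged sketch of points 2 and 3 as a single step; the release tag, CloudFront domain, and key are placeholder values, and a `DRY_RUN` guard is added so the sketch can run without credentials or network:

```shell
# RELEASE_TAG, CLOUDFRONT_DOMAIN, and S3_KEY are illustrative placeholders.
RELEASE_TAG=v1.0.0
CLOUDFRONT_DOMAIN=example.cloudfront.net
S3_KEY=node-gyp/v1/packages_amd64_node22-bb5ac136.tar.gz
DRY_RUN=1
run() { if [ "${DRY_RUN:-0}" = 1 ]; then echo "+ $*"; else "$@"; fi; }

# Upload release assets with the first-party gh CLI; --clobber replaces an
# existing asset of the same name, so re-runs are idempotent.
run gh release upload "$RELEASE_TAG" packages_amd64_node22.tar.gz --clobber

# Fail the job if CloudFront cannot serve the key (e.g. path_patterns does
# not allowlist node-gyp/*): -f maps HTTP errors to a nonzero exit code,
# and -I sends a HEAD request so nothing is downloaded.
run curl -fI "https://$CLOUDFRONT_DOMAIN/$S3_KEY"
```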
workflows: swap softprops for gh CLI, add CloudFront reachability check, remove --acl (#18)

* workflows: swap softprops for gh CLI, add CloudFront reachability check, remove --acl

  Three follow-up corrections to PR #17 (full rationale in the commit message above).

* Update .github/workflows/build-node-packages.yml

  Co-authored-by: Eli Skeggs <1348991+skeggse@users.noreply.github.com>

* workflows: hoist matrix values to job-level env, drop expression substitution in run: blocks

  Eli's review flagged `${{ matrix.arch }}` in a run: block as an injectable pattern even though the matrix values are hardcoded and not truly exploitable. Apply the pattern consistently across the whole workflow:

  - Hoist PLATFORM, ARCH, BAZEL_ARCH, REPO to job-level env so each step can reference them as shell variables ($ARCH etc.) rather than GitHub Actions expressions (${{ matrix.arch }}). Job-level env evaluates matrix context since the job is instantiated per matrix combination, so this DRYs up the per-step env blocks.
  - Rewrite every `run:` block to reference the job-level env vars. No more `${{ ... }}` expressions inside shell scripts.
  - Secret references (GITHUB_TOKEN) remain step-scoped per least-privilege.
  - Minor cleanup: collapse three separate `echo ... >> $GITHUB_ENV` lines into a single `{ ...; } >> "$GITHUB_ENV"` block.

  Addresses Eli's inline comment on line 114 of the pre-hoist file.

---------

Co-authored-by: Harshita Gupta <harshita-gupta@users.noreply.github.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: Eli Skeggs <1348991+skeggse@users.noreply.github.com>
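The hazard Eli flagged can be simulated in plain shell. GitHub Actions substitutes `${{ ... }}` expressions into the script *text* before the shell ever runs, which is what the `eval` below stands in for; the value itself is illustrative:

```shell
# A value containing shell metacharacters, standing in for untrusted input.
value='amd64"; echo INJECTED; :"'

# Expression-style (${{ matrix.arch }}): the value is spliced into the command
# string itself, so its metacharacters become code when the script runs.
script="echo \"arch: ${value}\""
eval "$script"                            # also runs the injected echo

# Env-var style ($ARCH from job-level env): the value arrives as data and is
# printed verbatim, metacharacters and all.
ARCH="$value" sh -c 'echo "arch: $ARCH"'
```

This is why hoisting matrix values to `env:` and referencing them as `$ARCH` closes the pattern even when, as here, the matrix values are hardcoded and not actually attacker-controlled.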
Summary
After building native module packages (bcrypt, cld, unix-dgram, @datadog/pprof), upload them to s3://asana-oss-cache/node-gyp/v1/ in addition to the GitHub Release. This enables codez to fetch these packages via Bazel http_file instead of committing ~112 MB of tarballs to git, saving ~305 MB total per checkout (node18/node20 tarballs are dead code and will be deleted).
Changes
`build-node-packages.yml`:
- `id-token: write` permissions for AWS OIDC auth
- `bazel_arch` matrix field to map `x64` → `amd64` for S3 naming
- `configure-aws-credentials` step using the `push_node_gyp_packages` IAM role
- `tools_repositories.bzl` snippet

`stage_for_s3.bash`:
- Separate `packages_*.tar.gz` files before the fibers processing loop
- Keep them out of the `find . -name "*.gz"` loop

Prerequisites
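A sketch of the `stage_for_s3.bash` change; the directory and file names here are made up for illustration, and the real script's layout may differ:

```shell
# Stage two kinds of artifacts, then pull packages_*.tar.gz aside before the
# fibers loop so `find . -name "*.gz"` no longer sees them.
mkdir -p staging packages_out
: > staging/packages_amd64_node22.tar.gz   # stand-in native-module bundle
: > staging/fibers_v5.0.0_node22.tar.gz    # hypothetical fibers tarball
mv staging/packages_*.tar.gz packages_out/
# The fibers processing loop now only matches fibers tarballs:
find staging -name "*.gz"
```

Moving the packages tarballs out of the tree first is what prevents them from being mixed into the fibers archive.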
- IAM role `push_node_gyp_packages` must be provisioned first: Asana/codez PR #388637
- Run `z permissions iam push` in codez to create the AWS role

Test Plan
- Run `build-node-packages.yml` manually from the Actions tab
- Verify https://asana-oss-cache.s3.us-east-1.amazonaws.com/node-gyp/v1/packages_{arch}_node22.tar.gz

Co-authored with Claude